When a dataset covers a large variety of countries, for instance, parameters make it easier to navigate. With parameters, you can choose exactly what you want to visualise. In this report I will be visualising COVID-19 data obtained from the ECDC on the daily numbers of newly reported COVID-19 cases and deaths in EU countries.
The parameters I will be using are country, year and month. Using these parameters, I will generate an interactive plot of the newly reported number of cases and one of the newly reported number of deaths. I will focus on the Netherlands and two neighbouring countries from October to December 2021. The reason is that I was infected with COVID-19 during this period and was curious to see how many others shared the same unfortunate fate.
The parameters are chosen when rendering the report, and are set as seen below.
#generate the report with chosen parameters
rmarkdown::render("06_parameterizedcovid.Rmd", params = list(country = c("Netherlands", "Belgium", "Germany"),
year = 2021,
month = 10:12))
For the parameterization to work, the YAML header of the R Markdown file needs to contain defaults, which are used for any parameter that is not specified when rendering.
---
params:
  country: "Netherlands"
  year: 2022
  month: 10
---
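The way defaults and render-time values combine can be illustrated with plain lists. This is only a minimal sketch that mimics the behaviour with base R's `modifyList()`; it is not how rmarkdown implements it internally, and the `defaults`, `overrides` and `effective` names are hypothetical.

```r
# Defaults as declared in the YAML header of the report
defaults <- list(country = "Netherlands", year = 2022, month = 10)

# Hypothetical render-time values: only `year` and `month` are supplied
overrides <- list(year = 2021, month = 10:12)

# modifyList() replaces entries with matching names and keeps the rest,
# which mimics (but does not reproduce) supplied values taking precedence
effective <- modifyList(defaults, overrides)

# Inspect the combined parameter list
str(effective)
```

Here `country` keeps its default because it was not supplied, while `year` and `month` take the supplied values.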
library(tidyverse)
library(plotly)
library(scales)
library(here)
data <- read.csv(here("raw_data/data.csv")) #load the data
After the data is loaded, it can be filtered to use the given parameters.
data_filtered <- data %>% filter(countriesAndTerritories %in% params$country,
year %in% params$year,
month %in% params$month) #filter for the given parameters
data_filtered <- mutate(data_filtered, "date" = paste(day, month, year, sep="/")) #add date column
data_filtered$date <- as.Date(data_filtered$date, format="%d/%m/%Y") #change the data type to date
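To see what the filtering and date conversion do, the same steps can be tried on a small made-up data frame. The sketch below uses base R only; the `toy` values and the `toy_params` list are invented for illustration and are not taken from the ECDC data.

```r
# Toy stand-in for the ECDC data (values are made up for illustration only)
toy <- data.frame(
  countriesAndTerritories = c("Netherlands", "Belgium", "France", "Netherlands"),
  day   = c(1, 15, 15, 31),
  month = c(10, 11, 11, 12),
  year  = c(2021, 2021, 2021, 2020),
  cases = c(100, 200, 300, 400)
)

# Hypothetical parameter list standing in for `params`
toy_params <- list(country = c("Netherlands", "Belgium"),
                   year = 2021,
                   month = 10:12)

# %in% keeps a row when its value matches any of the supplied values,
# so a single value and a vector of values are handled the same way
keep <- toy$countriesAndTerritories %in% toy_params$country &
  toy$year %in% toy_params$year &
  toy$month %in% toy_params$month
toy_filtered <- toy[keep, ]

# Build a day/month/year string and parse it into a Date
toy_filtered$date <- as.Date(
  paste(toy_filtered$day, toy_filtered$month, toy_filtered$year, sep = "/"),
  format = "%d/%m/%Y"
)

print(toy_filtered)
```

Because every condition uses `%in%`, the same code works whether a parameter holds one value (`year = 2021`) or several (`month = 10:12`).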
Finally, the data can be plotted:
plot_cases <- ggplot(data_filtered,
aes(x = date, y = cases, group = countriesAndTerritories,
color = countriesAndTerritories))+
geom_line()+
geom_point(size = 1)+
labs(title = "Number of newly reported COVID-19 cases over time by country",
y = "Number of COVID-19 cases",
x = "Date",
color = "Country")+
scale_x_date(date_breaks = datebreaks, date_labels = "%d-%m-%y")+
scale_y_continuous(labels = label_comma())+
theme_minimal()+
theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = 1))
ggplotly(plot_cases) #generate an interactive plot showing the number of cases
Number of newly reported COVID-19 cases over time by country, with the number of COVID-19 cases on the y-axis and the date on the x-axis.
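Both plot chunks pass a `datebreaks` value to `scale_x_date()`, but its definition is not shown in this report. One possible way to set it, purely as an assumption, is to derive the spacing from the number of months requested. The `choose_date_breaks()` helper and its thresholds below are hypothetical.

```r
# Hypothetical helper: the report's actual `datebreaks` definition is not shown.
# Returns a break specification string of the kind accepted by the
# `date_breaks` argument of ggplot2::scale_x_date(), e.g. "1 week".
choose_date_breaks <- function(n_months) {
  if (n_months <= 3) {
    "1 week"
  } else if (n_months <= 12) {
    "1 month"
  } else {
    "3 months"
  }
}

# Example with the parameters used above: one year, three months
example_year  <- 2021
example_month <- 10:12
datebreaks <- choose_date_breaks(length(example_year) * length(example_month))
```

With a short period the axis gets weekly labels, while a period spanning multiple years falls back to coarser breaks so the labels stay readable.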
plot_deaths <- ggplot(data_filtered,
aes(x = date, y = deaths, group = countriesAndTerritories,
color = countriesAndTerritories))+
geom_line()+
geom_point(size = 1)+
labs(title = "Number of newly reported COVID-19 deaths over time by country",
y = "Number of COVID-19 deaths",
x = "Date",
color = "Country")+
scale_x_date(date_breaks = datebreaks, date_labels = "%d-%m-%y")+
scale_y_continuous(labels = label_comma())+
theme_minimal()+
theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = 1))
ggplotly(plot_deaths) #generate an interactive plot showing the number of deaths
Number of newly reported COVID-19 deaths over time by country, with the number of COVID-19 deaths on the y-axis and the date on the x-axis.
Because the report is parameterized, it’s simple to recreate the graph for different countries and time periods:
#generate the plots with different parameters
rmarkdown::render("06_parameterizedcovid.Rmd", params = list(country = c("Estonia", "Latvia", "Lithuania"),
year = 2020:2021,
month = 1:12))
Number of newly reported COVID-19 cases over time by country, with the number of COVID-19 cases on the y-axis and the date on the x-axis, this time using different parameters.
Number of newly reported COVID-19 deaths over time by country, with the number of COVID-19 deaths on the y-axis and the date on the x-axis, this time using different parameters.
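When several parameter combinations are needed, the render calls can be driven from a list. The following is a minimal sketch under a few assumptions: the set names and the `covid_*.html` output file names are hypothetical, and the loop only runs when the Rmd file is present in the working directory.

```r
# Parameter sets taken from the two render calls in this report;
# the list names are hypothetical and only used to build file names
param_sets <- list(
  nl_be_de = list(country = c("Netherlands", "Belgium", "Germany"),
                  year = 2021,
                  month = 10:12),
  baltics  = list(country = c("Estonia", "Latvia", "Lithuania"),
                  year = 2020:2021,
                  month = 1:12)
)

# One output file per parameter set so the reports do not overwrite each other
output_files <- paste0("covid_", names(param_sets), ".html")

rmd <- "06_parameterizedcovid.Rmd"

# Only render when the report and the rmarkdown package are available
if (file.exists(rmd) && requireNamespace("rmarkdown", quietly = TRUE)) {
  for (i in seq_along(param_sets)) {
    rmarkdown::render(rmd,
                      params = param_sets[[i]],
                      output_file = output_files[i])
  }
}
```

Giving each render its own `output_file` keeps one report per parameter set instead of replacing the previous output.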